-
-
Notifications
You must be signed in to change notification settings - Fork 630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Enum field #2017
Add Enum field #2017
Conversation
if value is None: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tangent: This pattern is repeated in nearly every field. Is there any reason this isn't in Field.serialize
? The complimentary None
handling is implemented in Field.deserialize
. Maybe something to tack onto #2012.
src/marshmallow/fields.py
Outdated
raise self.make_error("unknown", choices=self.choices) from exc | ||
|
||
|
||
class StringEnum(TypedEnum, String): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I was doing everything all over again, I'd leave it at just the string representation of the enum and call it a day. The only value with using an underlying numeric is that when sending to some other languages they can successfully decode an unknown value into their enum type. But that just seems like asking for subtle bugs instead of being hit with "Cannot deserialize MyEnum.FooBar15" or whatever they'd spit out. I've been away from python long enough that I don't recall what Python does off hand but if I had to guess Python barfs when attempting that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than str(value)
, which would expose the enum name, I'd rather use the member name (SymbolEnum
). Offering to serialize by value allows users to provide human-readable names as enum values, for instance, or at least decouple business logic and API layer a bit.
But thanks for the feedback. Your comment confirms my intuition that the dump/load asymmetry is too much API surface for the need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the symbol name is what I meant, just worded it kind of poorly. The asymmetry was a pain in the ass for tests and it tripped me up a few times. Pretending that they're just a union of string literals is the way to go imo
Funnily enough, I showed up here because I got notified I'm a maintainer of a critical package, which turns out is the one I question. |
Renamed Enum -> EnumSymbol, TypedEnum -> EnumValue. I'll squash later on. Docstrings still missing. |
Just tried this and it worked a treat! I'm happy with this implementation. Still requires docstrings and perhaps testing with a DateTime field or some other field for which ser/deser is less trivial to assert ser/deser actually happens. |
e96d92d
to
d77392a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As PR author, I can't request changes.
Please don't merge until I fix.
src/marshmallow/fields.py
Outdated
def __init__(self, cls_or_instance: Field | type, enum: type[Enum], **kwargs): | ||
super().__init__(**kwargs) | ||
self.enum = enum | ||
self.choices = ", ".join([str(m.value) for m in enum]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong, we need to use the field to dump the value. str
only work for trivial cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pushed a change to let the field serialize the values.
Our API kinda sucks because serialize expects attr
and obj
parameters and we don't have those and actually I don't think they are ever used. There might be a nice opportunity for a refactor here. I opened #2039.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.field._serialize(m.value, None, None)
This won't work for fields that need attr
or obj
(but is there any ? #2039) or fields that are not symmetric (don't dump what they loaded). I'm not talking about fields that don't round-trip, I mean if several equivalent representation are accepted at load time and only one is used at dump time, this one is the one that will be displayed here and that's fine, I believe. It would be more of a problem for custom field that would purposely have asymmetric load/dump methods, like using different time formats, for instance.
Since those must be quite rare, and using them in this enum field would be even more rare, I'm fine with this limitation.
We could remove the list of accepted values from the error message but I think it is useful in the "normal" case (99% of the time) and it will be needed for documentation purpose anyway (apispec).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.field._serialize(m.value, None, None)
is already used in Mapping
.
PR fixed. I'm not 100% happy with |
Alright, moving on with this. |
Just after merging this, as I was updating the changelog to release, I realized the naming didn't follow our habits of naming fields by object type. I pushed a new PR to propose merge of both fields into an |
Would you consider adding a short migration guide for |
Replaces @orenc17's #1885.
This is a long standing feature request. See for instance discussions in #267.
The status quo is that people can use @justanr's marshmallow_enum.
Currently, this lib doesn't seem actively maintained. I don't blame the author. A few things are outdated, deprecated stuff. The changes since MA3 have been user contributed and some are still missing.
I'm thinking if we recommend this lib as the reference choice, why not integrate it in code? This is what I've been trying to do.
I started by importing the code and updating it for PY3/MA3.
I also removed the dump/load asymmetry (dump by value, load by name) that I found awkward and revamped the error message generation.
I was still not happy with the
by_value
/by_name
mechanism. I serialize by name, so I could live with only this, but I understand the need for serializing by value (to get an int or a nicer string). However, I like the type to be well-defined so that we can inherit deserialization from basic fields, and to help documenting the type (e.g. in apispec).My proposal is therefore an
Enum
field serializing by name, and typed enum fieldsStringEnum
andIntegerEnum
, to serialize by value with a given type. Users would be free to create new types but I believe those cover most cases.Transition from
marshmallow_enum
should be relatively smooth for users who don't use the dump/load asymmetry and who don't rely on exact error messages.This still lacks a bit of doc but I'd rather get feedback before polishing.
Note we can't use
OneOf
validator because validation occurs after deserialization and wrong values are caught at deserialization already.